We describe quantitatively an RNA molecule under the influence of an external force exerted 
at its two ends as in a typical single-molecule experiment. Our calculation incorporates the 
interactions between nucleotides by using the experimentally-determined free energy rules for 
RNA secondary structure and models the polymeric properties of the exterior single-stranded 
regions explicitly as elastic freely-jointed chains. We find that in spite of complicated secondary 
structures, force-extension curves are typically smooth in quasi-equilibrium. We identify and 
characterize two sequence/structure-dependent mechanisms that, in addition to the sequence- 
independent entropic elasticity of the exterior single-stranded regions, are responsible for the 
smoothness. These involve compensation between different structural elements on which the 
external force acts simultaneously, and contribution of suboptimal structures, respectively. We 
estimate how many features a force-extension curve recorded in non-equilibrium, where the pulling 
proceeds faster than rearrangements in the secondary structure of the molecule, could show in 
principle. Our software is available to the public through a 'RNA-pulling server'. 



I. INTRODUCTION 



In recent years, single-molecule experiments employing optical tweezers, atomic force
microscopy, and other techniques have successfully probed basic physical properties of
biomolecules through the application of forces in the pN range (see, e.g., Bockelmann et al.
(1997); Essevaz-Roulet et al. (1997); Mehta et al. (1999); Rief et al. (1997, 1999); Smith et
al. (1996); Yang (2000)). Both simple elastic properties of the polymers (such as persistence
length and longitudinal elasticity) and structural transitions (e.g. unfolding of protein
domains) were characterized by recording and analyzing force-extension curves (FEC's). For
nucleic acids, a prominent experiment of the latter type is the 'unzipping' of double-stranded
DNA (Bockelmann et al., 1997; Essevaz-Roulet et al., 1997). The resulting FEC's display clear
sequence-specific features (e.g. local maxima), which may be attributed to regions that are
more strongly bound than their neighbors (Essevaz-Roulet et al., 1997; Lubensky and Nelson,
2000; Thompson and Siggia, 1995). In contrast, long single-stranded DNA, which, like RNA, may
fold into complicated branched structures by forming intra-strand basepairs, showed extremely
smooth FEC's in a very recent experiment by Maier et al. (2000). Thus, depending on its
structure, DNA may show a broad range of FEC's from very rugged to completely featureless.
However, it is unclear how the structure quantitatively determines the outcome of the FEC
measurement.

Here, we address this question theoretically, focusing on the case of RNA and restricting
ourselves to secondary structure (i.e. basepairing patterns only, instead of the full tertiary
structure). In this context, RNA seems to be a more interesting object than DNA, since RNA
naturally occurs in many different and functionally important structures, while DNA is
primarily found as a double strand. One may hope that pulling experiments generate new
insights into the RNA folding problem (Tinoco and Bustamante, 1999), including the folding
pathways (Chen and Dill, 2000; Isambert and Siggia, 2000; Thirumalai and Woodson, 2000). Also,
force-induced denaturation of RNA is currently studied experimentally (C. Bustamante and I.
Tinoco, private communication). The limitation to secondary structure allows us to draw upon
the experimentally determined 'free energy rules' for RNA secondary structure (Freier et al.,
1986; Mathews et al., 1999; Walter et al., 1994), which yield minimum free energy structures
that agree reasonably well with experimentally and phylogenetically determined ones (Mathews
et al., 1999). Furthermore, it permits us to employ and extend the efficient dynamic
programming algorithms (Hofacker et al., 1994; McCaskill, 1990; Zuker and Stiegler, 1981)
which can compute the exact partition function (including all possible secondary structures)
and reconstruct the minimal free energy structures in polynomial time. Experimentally, the
secondary structures may be probed in specific ionic conditions (e.g., those with only
monovalent ions) such that the tertiary contacts are strongly disfavored (due to electrostatic
repulsion of the sugar-phosphate backbone) (Tinoco and Bustamante, 1999).

The type of experiment that we consider is sketched in Fig. 1. The distance R between the two
ends of an RNA molecule is held fixed, e.g., by attaching them to two beads whose positions
are controlled by optical tweezers, and the force f acting on the beads is recorded as a
function of R. As long as the external change in force/extension is applied at a much slower
time scale than that of structural transitions of the molecule, the equilibrium FEC is
measured. In the main part of the present article, we assume that this is always the case.
Experimentally, this condition is usually checked by retracing the FEC (e.g., a hysteresis
effect is a clear sign of a non-equilibrium situation).

Besides the above-mentioned free energy parameters for RNA secondary structure, we need a
polymer model for single-stranded RNA as input in order to make quantitative predictions of
FEC's. To that end, we employ an elastic freely jointed chain model, which has been used to
fit experimental FEC's of single-stranded DNA (Montanari and Mezard, 2000; Smith et al.,
1996). This introduces two polymer parameters, the Kuhn length characterizing the lateral
rigidity, and the longitudinal elasticity, which is determined by the forces needed to stretch
the chemical structure of the backbone. We estimate both from the experiments on DNA, so that
we are left with no free parameters.

FIG. 1 Sketch of the pulling experiment considered in the text: the two ends of an RNA
molecule are attached to beads (shaded gray) and held fixed at distance R, while the force f
acting on the beads is measured. The open circles represent the open bases of the exterior
single strands, modeled here as elastic freely jointed chains.

We find that for different secondary structures with all other parameters (temperature,
sequence length, etc.) fixed, the FEC's of RNA vary over a broad range from very rugged to
very smooth. Apart from the entropic elasticity of the exterior single strand, which smoothens
the features in the FEC independent of the secondary structure as already discussed by
Thompson and Siggia (1995), there are two additional smoothing mechanisms. The first is a
'compensation effect': the increase in the length of the exterior single strand upon opening
of a structural element and the associated drop in the tension may be absorbed by rebinding of
bases from the exterior single strand in other structural elements. The second is due to
thermal fluctuations in the secondary structure, i.e. the contribution of suboptimal
structures. We discuss both mechanisms and analyze the fluctuations in the FEC quantitatively.
The equilibrium FEC's of typical (natural or random) RNA sequences are smooth and display no
distinguishable signatures of individual structural elements opening. This is consistent with
the experimental result of Maier et al. (2000) for single-stranded DNA, but applies even for
sequences with only a few hundred nucleotides, i.e. for much shorter sequences than used in
their experiment.

For the purpose of obtaining information on the structure of RNA, the measurement of
equilibrium FEC's is therefore not very useful. More promising options include the measurement
of the fluctuations about the equilibrium and non-equilibrium FEC's, where the pulling
proceeds faster than (some of) the rearrangements in the structure. While the present approach
is extended readily to include equilibrium fluctuations (Gerland, U., R. Bundschuh, and T.
Hwa, in preparation), a quantitative treatment of the dynamics of force-induced denaturation
of RNA presents a challenge to theoreticians.

The organization of the paper is as follows. In the next section, we explain the details of
our model and the way we calculate the FEC's. Readers interested in the results only should
directly proceed to section III. The discussion in section IV explores the possibility of
using experimental FEC's of appropriately designed sequences as an alternative way to
determine the RNA free energy parameters. In addition, we estimate to what extent features may
be expected in non-equilibrium FEC's.

II. MODEL AND METHODS


We assume that the force f(R) acting on the beads (see Fig. 1) is measured as a function of
the fixed distance $R = |\mathbf{R}|$, where $\mathbf{R}$ denotes the end-to-end vector of the
RNA molecule, and that R is varied very slowly so that thermal equilibrium is always
maintained. In practice, the force measurement requires a device acting as a spring, hence the
distance cannot be kept exactly constant. However, we consider the situation where the
stiffness of this spring is much higher than that of the single-stranded RNA which has already
been pulled out. This condition could only be violated in the very early part of the pulling
experiment, which is not the focus of the present investigation. We may therefore neglect the
presence of the spring altogether, which amounts to working in the 'fixed-distance ensemble'
(footnote 1). Another difference between our model and actual experiments is that we neglect
the presence of additional spacer sequences, which are used to connect the RNA molecule to the
force-measuring device (e.g. the beads). Again, we assume that they are stiffer than the
liberated single-stranded RNA, since we are interested in the size of the features in the FEC,
which are observable in an ideal measurement.

1 In the 'fixed-distance ensemble', only the average force is well-defined, whereas the
fluctuations about the average diverge. This reflects the fact that it takes increasingly
higher forces to compensate thermal fluctuations on shorter and shorter timescales, in order
to keep the extension exactly fixed. Therefore, if one is interested in the fluctuations (of
either the force or the extension), the external spring should not be neglected, which would
amount to working in a mixed ensemble between 'fixed-distance' and 'fixed-force'.

The partition function at fixed extension, $Z_N(\mathbf{R})$, for a given RNA sequence
consisting of N nucleotides, may be written as a sum over the number m of exterior open bases
(as represented by open circles in Fig. 1). For each m, the secondary structure contributes a
factor $Q_N(m)$ to the partition function, according to the free energy rules for RNA/DNA
secondary structure to be detailed shortly below. This contribution needs to be weighted by
the probability $W(\mathbf{R}; m)$ that the chain of m exterior open bases has end-to-end
vector $\mathbf{R}$, given by an appropriate polymer model for the single strand. Together,
they yield
$$ Z_N(\mathbf{R}) = \sum_{m} Q_N(m)\, W(\mathbf{R}; m) . \tag{1} $$

The normalization $\int d^3R \, W(\mathbf{R}; m) = 1$ assures that the integral of
$Z_N(\mathbf{R})$ over space yields the usual partition function $Z_N$ for N nucleotides
without any external constraints. Eq. (1) clearly separates the contribution of the secondary
structure, which is entirely contained in $Q_N(m)$, from the contribution of the exterior
single strand contained in $W(\mathbf{R}; m)$. Note that the polymer properties of the
interior single strands (i.e. the single strands not subject to the external force) are
contained in $Q_N(m)$ through the loop-entropy parameters, which are part of the free energy
rules derived from experiments (see Walter et al. (1994) and references therein).

Secondary structure. The number of possible secondary structures for a given sequence of
length N grows exponentially with N. To each structure S, a Boltzmann weight $\zeta(S)$ may be
assigned with the help of the free energy rules (Walter et al., 1994) which contain a large
number of experimentally determined energy and enthalpy parameters, e.g., those for the
stacking of basepairs, formation of internal, hairpin, bulge or multi-loops, and dangling
ends. Due to the large number of possible structures, the full partition function
$Z_N = \sum_S \zeta(S)$ is impossible to evaluate by enumeration, except for very small N.
However, one can make use of recursion relations that express the partition function for a
subsequence with the help of the partition functions for even shorter subsequences (McCaskill,
1990; Zuker and Stiegler, 1981), and proceed to compute the full partition function exactly in
$O(N^3)$ time. These recursion relations owe their existence to the fact that the class of
secondary structures was defined to include only nested structures, e.g. two basepairs (i,j)
and (k,l) with i < k < j < l are not admitted (the occurrence of such pairings is called a
pseudoknot and contributes relatively little to the free energy of natural RNAs (Tinoco and
Bustamante, 1999)). One implementation of this algorithm with very detailed free energy rules
is the 'Vienna package' (Hofacker et al., 1994, publicly available at
http://www.tbi.univie.ac.at/). In the following, we describe the modifications that we made to
this package in order to obtain $Q_N(m)$ and the corresponding minimum free energy structures.

The Vienna package calculates the auxiliary partition function $\Pi(i,j)$ for the substrand
(i.e., a contiguous segment of the sequence) from base i to base j, under the condition that
base i and base j are paired. These quantities can be used to calculate the partition function
$Q(j; n)$ of the substrand from base 1 to base j, under the condition that the exterior part
of the configurations is $0 \le n \le j$ bases long. The recursion formula for Q is
(footnote 2)

$$ Q(j+1; n) = Q(j; n-1) + \sum_{i=n-\Delta+1}^{j} Q(i-1; n-\Delta)\, \Pi(i, j+1) , $$

obtained by splitting the partition function $Q(j+1; n)$ up according to all possible binding
partners of base j+1. This formula, together with the appropriate boundary conditions for
j = 0 and n = 0, can be solved recursively by calculating $Q(j; n)$ first for all n at a given
j and then for increasing j. In the end, we have $Q_N(m) = Q(N; m)$ for the m exterior bases
in $O(N^3)$ time.

To produce the minimum free energy structures at fixed m, we use an equivalent recursive
scheme, but replacing the summations by maxima to obtain first the minimum free energy (Zuker
and Stiegler, 1981). Then, we determine the corresponding structure by going through the
scheme in reverse and reconstructing at each step which of the terms was maximal.

Polymer model. The simplest polymer model for the exterior single strand (the open circles in
Fig. 1) is the Gaussian chain (de Gennes, 1979). However, as shown below, the force-induced
denaturation of RNA occurs at forces of order 10 pN, where the exterior single strand is
strongly stretched and the Gaussian model breaks down. In this regime, an elastic freely
jointed chain (EFJC) model (footnote 3) yields a good fit to experimental FEC's (Montanari and
Mezard, 2000; Smith et al., 1996).
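As an illustration of the recursion for $Q(j; n)$ described in the secondary-structure paragraphs above, the following minimal Python sketch fills the table row by row. It is not the modified Vienna package code: the function name `exterior_partition` and the callable `pi` are hypothetical, and `pi(i, j)` is only a stand-in for the auxiliary partition function $\Pi(i,j)$, which in practice would come from a full secondary-structure calculation. The boundary condition $Q(0; 0) = 1$ (and zero otherwise) is an assumption consistent with an empty sequence having no exterior bases.

```python
def exterior_partition(N, pi, delta=3):
    """Return [Q(N; n) for n = 0..N] from the recursion
    Q(j+1; n) = Q(j; n-1) + sum_{i=n-delta+1}^{j} Q(i-1; n-delta) * pi(i, j+1).

    pi(i, j) is a caller-supplied stand-in (assumption) for the partition
    function of the substrand i..j with bases i and j paired (1-based).
    delta is the effective exterior length contributed by one stem.
    """
    # Q[j][n]: bases 1..j, exterior part n units long.
    Q = [[0.0] * (N + 1) for _ in range(N + 1)]
    Q[0][0] = 1.0  # assumed boundary condition: empty sequence
    for j in range(N):                 # build row j+1 from rows 0..j
        for n in range(N + 1):
            # Base j+1 unpaired: it extends the exterior strand by one.
            total = Q[j][n - 1] if n >= 1 else 0.0
            # Base j+1 paired with base i: the stem counts as delta units.
            if n >= delta:
                for i in range(max(1, n - delta + 1), j + 1):
                    total += Q[i - 1][n - delta] * pi(i, j + 1)
            Q[j + 1][n] = total
    return Q[N]


# Toy usage: a 5-base strand in which only the pair (1, 5) may form.
toy = exterior_partition(5, lambda i, j: 2.0 if (i, j) == (1, 5) else 0.0)
print(toy)
```

In the toy case the fully open strand contributes at n = 5, while the single stem contributes at n = 3, reflecting the choice delta = 3. The triple loop makes the $O(N^3)$ cost of this step explicit.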



The distance along the backbone between two adjacent nucleotides is the segment length of the
chain. We denote it by l and assign an elastic energy
$V(\mathbf{r}) = \frac{\kappa}{2} (|\mathbf{r}| - l)^2$ per segment, where $\mathbf{r}$
represents the end-to-end vector of the segment. Instead of attempting (the very cumbersome)
exact computation of the end-to-end vector distribution $W(\mathbf{R}; m)$ of the chain, we
employ an asymptotic expression that becomes exact in the limit of large m and is sufficiently
accurate for our purposes even for small m. It can be derived along the line of a similar
calculation for the case of the regular (i.e. non-elastic) freely jointed chain given in
(Flory, 1967). The result is conveniently expressed in terms of the quantity

$$ q(h) = \frac{\int d^3r \; e^{-\mathbf{h}\cdot\mathbf{r} - V(\mathbf{r})/k_B T}}
               {\int d^3r \; e^{-V(\mathbf{r})/k_B T}} , $$

where $k_B$ denotes the Boltzmann constant and $\mathbf{h}$ is a vector of length h with fixed
(but arbitrary) orientation in space. The asymptotic expression is then

$$ W(\mathbf{R}; m) \approx C \, [q(h)]^m \, e^{-hR} , \tag{2} $$

where C is a normalization constant and h is determined from
$R = m \, \frac{\partial}{\partial h} \log q(h)$. We incorporate the effect of a Kuhn length
b > l by rescaling the end-to-end vector distribution through $l \to b$ and $m \to ml/b$.

2 Here, the constant $\Delta = 3$ accounts for the fact that each stem branching from the
exterior single strand contributes an additional segment, whose length is approximately equal
to the length of three single stranded bases.

3 Self avoidance in the exterior single strand may be neglected, again because of its highly
stretched state.

FIG. 2 (a) Force-extension curve (FEC) for a group I intron (solid line, see text for details)
and a homopolymeric RNA of the same length, N = 251 (dashed). The depicted secondary structure
is the minimum free energy structure at R = 10 nm. (b) FEC for a hairpin composed of randomly
chosen basepairs (solid) and a homogeneous hairpin of AU-basepairs (dashed). In both cases the
total sequence length is N = 252. (c) Mean number of exterior stems, $n_{\rm stem}(R)$, for
the group I intron.
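The saddle-point condition $R = m \, \partial \log q / \partial h$ of the polymer model can be evaluated numerically. The sketch below is a minimal illustration, not the authors' implementation: it assumes the harmonic segment energy $V = \frac{\kappa}{2}(|\mathbf{r}| - l)^2$ with $\sigma = (\kappa/k_BT)^{-1/2}$, performs the angular average analytically ($\sinh(hr)/hr$), integrates the radial part by the trapezoid rule, and finds h by bisection. The function names, the integration window of six standard deviations, and the bracket `h_max` are all choices made for this example; the Kuhn-length rescaling is not included.

```python
import math

def log_q(h, l=0.7, sigma=0.1, steps=400):
    """log q(h) for an elastic segment; lengths in nm, h in 1/nm."""
    r_min = max(0.0, l - 6.0 * sigma)
    r_max = l + 6.0 * sigma
    dr = (r_max - r_min) / steps
    num = den = 0.0
    for k in range(steps + 1):
        r = r_min + k * dr
        w = 0.5 if k in (0, steps) else 1.0            # trapezoid weights
        boltz = r * r * math.exp(-0.5 * ((r - l) / sigma) ** 2)
        x = h * r
        angular = math.sinh(x) / x if abs(x) > 1e-8 else 1.0
        num += w * boltz * angular
        den += w * boltz
    return math.log(num / den)

def extension_per_segment(h, **kw):
    """d log q / dh by central differences, i.e. R/m at the saddle point."""
    eps = 1e-4
    return (log_q(h + eps, **kw) - log_q(h - eps, **kw)) / (2.0 * eps)

def solve_h(R, m, h_max=20.0, **kw):
    """Bisection for h with m * dlog q/dh = R (valid while R/m is reachable below h_max)."""
    target = R / m
    lo, hi = 0.0, h_max
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if extension_per_segment(mid, **kw) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

h = solve_h(R=50.0, m=100)   # 100 exterior bases held at 50 nm
print(h, extension_per_segment(h))
```

In the stiff limit (small `sigma`) the extension per segment reduces to the Langevin function of the ordinary freely jointed chain, which provides a convenient check of the quadrature.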

Observables. Apart from the force at fixed extension, which is calculated from Eq. (1) by

$$ f(R) = -k_B T \, \frac{\partial}{\partial R} \log Z_N(R) \tag{3} $$

(see, e.g. (Flory, 1967)), we also calculate the mean number of stems, $n_{\rm stem}$, along
the exterior chain (for the structure depicted in Fig. 1 this would be $n_{\rm stem} = 2$).
This may be determined by introducing an extra free energy penalty, $\varepsilon_{\rm stem}$,
for each external stem into the calculation of $Q_N(m)$ and then differentiating numerically
with respect to $\varepsilon_{\rm stem}$, i.e.,

$$ n_{\rm stem}(R) = -k_B T \left. \frac{\partial}{\partial \varepsilon_{\rm stem}}
   \log Z_N(R) \right|_{\varepsilon_{\rm stem}=0} . $$
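To make the step from Eq. (1) to Eq. (3) concrete, the following sketch sums $Q_N(m) W(R; m)$ over m in log space and differentiates numerically with respect to R. It is an illustration under simplifying assumptions, not the calculation used for the figures: the Gaussian chain is used for W only because it has a closed form (the text uses the elastic freely jointed chain instead), and the toy input $\log Q_N(m) = \varepsilon (N - m)/k_BT$ mimics a homogeneous hairpin with a hypothetical binding energy per base. The names `log_w_gauss`, `log_z`, and `force` are invented for this example.

```python
import math

KT = 4.047  # k_B T at 20 C in pN nm

def log_w_gauss(R, m, l=0.7, b=1.9):
    """log of a 3D Gaussian end-to-end distribution with <R^2> = m*l*b (assumption)."""
    var = m * l * b / 3.0                       # variance per Cartesian component
    return -1.5 * math.log(2.0 * math.pi * var) - R * R / (2.0 * var)

def log_z(R, log_q, log_w=log_w_gauss):
    """log Z_N(R) = log sum_m Q_N(m) W(R; m), Eq. (1), via log-sum-exp (m >= 1)."""
    terms = [log_q[m] + log_w(R, m) for m in range(1, len(log_q))]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def force(R, log_q, dR=1e-3):
    """f(R) = -k_B T d/dR log Z_N(R), Eq. (3), by central differences (pN)."""
    return -KT * (log_z(R + dR, log_q) - log_z(R - dR, log_q)) / (2.0 * dR)

# Toy input: homogeneous hairpin, eps = 4.2 pN nm per base (about 0.6 kcal/mol).
N, eps = 252, 4.2
log_q = [eps * (N - m) / KT for m in range(N + 1)]
for R in (30.0, 50.0, 70.0):
    print(R, force(R, log_q))
```

For this toy input the force is nearly independent of R, and its plateau lies close to the Gaussian estimate $(6 k_B T \varepsilon / lb)^{1/2}$ quoted in the caption of Fig. 4. The same numerical differentiation, applied to a penalty parameter instead of R, would give the mean stem number.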



Choice of Parameters. We work at room temperature, T = 20°C, and use the DNA polymer
parameters obtained by Montanari and Mezard (2000) by fitting to the experiment of Maier et
al. (2000) also for RNA, since we are not aware of the corresponding experimental data. (We do
not expect a large difference in the single strand properties between DNA and RNA, because of
the high similarity between their chemical structures.) The values are l = 0.7 nm, b = 1.9 nm,
and $(\kappa/k_B T)^{-1/2} = 0.1$ nm. We take the free energy parameters for RNA secondary
structure as supplied with the Vienna package. The salt concentrations at which these free
energy parameters were measured are [Na+] = 1 M and [Mg++] = 0 M.



III. RESULTS 

Figs. 2a and 2b show the FEC's (solid lines) for two RNA sequences with practically the same
total length and composition, both computed as described in the last section using the same
set of parameters. Strikingly, the first curve is almost completely smooth with no significant
features, while the second is extremely jagged with large 'jumps' in the force. This disparity
is entirely due to the difference between the secondary structures into which the two
sequences fold. The sequence in Fig. 2a originates from the group I intron of the methionine
tRNA of Scytonema hofmanii with a sequence length of N = 251 (GenBank# U10481). Its dominant
secondary structure (according to our algorithm; see footnote 4) at an extension of R = 10 nm
is also depicted in Fig. 2a. The sequence in Fig. 2b was artificially generated by
concatenating a randomly chosen sequence with its reverse complement, so that it folds into a
single hairpin composed of random basepairs. Its FEC is very similar to the experimental force
curve obtained upon unzipping double stranded DNA by Essevaz-Roulet et al. (1997); the
sawtooth-like oscillations correspond to a 'molecular stick-slip process' (Bockelmann et al.,
1997).

Why does the group I intron not display an abundance of features in the FEC like the hairpin
does? Its secondary structure consists of many structural elements (e.g. stem-loop
structures), the opening of which one might expect to produce clear signatures in the FEC.
Indeed, in their theoretical study of force-induced denaturation of DNA/RNA, Thompson and
Siggia (1995) concluded that the opening of individual basepairs in double stranded DNA cannot
readily be observed, but the opening of stem-loop structures in RNA should be.

4 The known native secondary structure of this sequence contains two helical regions forming a
pseudoknot. Since pseudoknots are excluded from our approach (as explained above), we removed
it from the structure computationally by replacing 6 basepairs in the less stable of the two
helical regions (positions no. 79-84 and 157-162) by artificial bases which are excluded from
base pairing. With this modification, the minimum free energy structure at zero force (as
determined by the Vienna package) is almost identical with the secondary structure known from
comparative sequence analysis (Gutell et al., 2001, available at
http://www.rna.icmb.utexas.edu/) outside of the pseudoknot region. Beyond the distance at
which the pseudoknot is pulled apart, our modification of the sequence should not affect the
FEC significantly. This expectation is supported by our numerical observation that the FEC's
for the unmodified sequence (ignoring the pseudoknot) and for our modified sequence become
close to identical beyond a distance of $R \approx 70$ nm.

FIG. 3 The problem of RNA pulling (in the fixed-distance ensemble) may be mapped onto the
statistical mechanics problem of a particle with a spring attached to it moving in a
one-dimensional disordered potential. The other end of the spring is externally controlled and
slowly advanced into one direction.

One fairly obvious effect that could cause the smooth 
FEC is thermal superposition of alternative secondary 
structures. Since one may expect that typical RNA structures (such as the one depicted in
Fig. 2a) are less well-designed than a perfect hairpin, force-induced denaturation
should make more alternative structures accessible
in the former case than in the latter. In our analysis 
below, we find that this effect is indeed non-negligible, 
but the largest loss of features originates from another, 
more subtle mechanism, which we call the 'compensation 
effect', and which persists even when no alternative sec- 
ondary structures are allowed. The compensation effect 
depends on the fact that when several structural elements 
are pulled at in parallel, the optimization process that de- 
termines the minimum free energy structure with a given 
number m of external open bases may reclose stretches 
of basepairs which had already been opened at a lower 
value m' < m. 

In our approach (see 'Model and Methods' above), 
the information on the secondary structure energetics 
for a given sequence is entirely contained in the function 
Q(m). With the help of the polymer model (contained 
in W(R;m)) this information is translated into a FEC 
via Eq. (3). Our investigation therefore comprises two 
steps. First, we seek to understand what property of 
Q(m) determines the size of the fluctuations in the FEC, 
and second, how this property depends on the secondary 
structure. 

The first question is addressed most readily for the special case of the random hairpin of
Fig. 2b. It is known that in the fixed-force ensemble, unzipping of a random hairpin may be
mapped onto the problem of a particle in a tilted one-dimensional random potential (de Gennes,
1975; Lubensky and Nelson, 2000). The random potential is correlated and has the statistical
properties of a one-dimensional random walk. In the fixed-distance ensemble, we may perform a
very similar mapping (footnote 5) (see Fig. 3). Here, the bias for the direction of movement
of the particle is not caused by a tilt of the potential, but instead by a spring that is
attached to the particle. The position of the other end of the spring is externally
controlled, i.e. it is determined by R, the given end-to-end distance of the RNA molecule.

In the following, we review the relation between the parameters of the
particle-in-a-random-potential problem, i.e. the spring constant $\gamma$ and the variance of
the random potential, and the parameters of the unzipping problem. This will also serve to
introduce our notation for the subsequent discussion. We may write the free energy
$G(m) = -k_B T \log Q(m)$ of the random hairpin as $G(m) = -\sum_{i=m+1}^{N} \eta(i)$, where
the $\eta(i)$ are random with mean $\langle \eta \rangle = \varepsilon$ and variance
$\langle \eta(i)\eta(j) \rangle - \langle \eta \rangle^2 = \delta_{ij} (\Delta\varepsilon)^2$.
Here, $\varepsilon$ represents the mean binding energy per base, which depends on the
GC-content of the hairpin, the temperature, and the salt concentrations, and
$\Delta\varepsilon$ measures the fluctuations of $\varepsilon$, both along a given hairpin and
between different realizations of the random sequence. The difference between two free
energies that are $\ell$ units apart, $\Delta G(\ell) = G(m) - G(m-\ell)$, then has the
variance

$$ \mathrm{var}(\Delta G(\ell)) = \ell \, (\Delta\varepsilon)^2 . \tag{4} $$

In the particle picture (see Fig. 3), $m - \ell$ corresponds to the position of the particle,
and m to the position of the other end of the spring. For fixed m, the particle therefore sees
the effective potential

$$ \Delta G(\ell) + \frac{\gamma}{2} \ell^2 , \tag{5} $$

i.e. Eq. (4) determines the variance of the random potential. The spring constant $\gamma$ is
determined by $\varepsilon$ as follows. If $\Delta\varepsilon$ were zero, the unzipping force
would take a constant value $f_0$ (cf. the dashed line in Fig. 2b, which shows the FEC of a
homogeneous AU-hairpin). The dependence of $f_0$ on $\varepsilon$ can be calculated
analytically by evaluating the sum in Eq. (1) by the saddle point method (see also (D.K.
Lubensky and D.R. Nelson, in preparation)). The result is shown in Fig. 4 (solid line). Now
$\gamma = l^2 \Gamma$, where $\Gamma$ is the local spring constant of a non-binding RNA of m
bases at force $f_0(\varepsilon)$. Since the spring constant of a homopolymer scales with the
inverse of the number of segments, we write $\Gamma = \Gamma_0/m$, where $\Gamma_0$ depends
only on $f_0$, but not on m. Graphically, $\Gamma_0(f_0)$ is the slope at $f = f_0$ of the
dashed line in Fig. 2a (FEC of a homopolymeric RNA), multiplied by 251 (the number of bases in
that example). In this way $\Gamma_0(f_0)$ may also be determined from an experimental FEC.

5 For these mappings, alternative structures of the hairpin sequence are neglected, which is a
good approximation due to the perfect design of the hairpin. Also, the nearest-neighbour
correlations in the random potential caused by the stacking energies are not taken into
account, since they would not change the qualitative predictions of the model.
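The random-walk statistics behind Eq. (4) can be checked with a short simulation. The sketch below is only an illustration: it draws independent Gaussian binding energies (the distribution, the parameter values, and the function name `delta_g_variance` are assumptions made for this example), builds $G(m)$ as minus the sum of the energies of the bases that are still bound, and estimates the variance of $\Delta G(\ell) = G(m) - G(m-\ell)$ over many sequence realizations.

```python
import random
import statistics

def delta_g_variance(ell, n_bases=60, m=40, eps=1.0, d_eps=0.3,
                     realizations=4000, seed=0):
    """Estimate var(G(m) - G(m - ell)) for independent random binding energies."""
    rng = random.Random(seed)
    samples = []
    for _ in range(realizations):
        # eta[i-1] plays the role of eta(i), i = 1..N (Gaussian is an assumption).
        eta = [rng.gauss(eps, d_eps) for _ in range(n_bases)]

        def G(k):
            # Free energy with k open bases: minus the energies of bases k+1..N.
            return -sum(eta[k:])

        samples.append(G(m) - G(m - ell))
    return statistics.pvariance(samples)

for ell in (1, 4, 12):
    print(ell, delta_g_variance(ell), ell * 0.3 ** 2)
```

The estimated variance grows linearly in $\ell$ with slope $(\Delta\varepsilon)^2$, as expected for uncorrelated energies; nearest-neighbour correlations from stacking energies are ignored here, as in the mapping itself.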
FIG. 4 Threshold force $f_0$ for unzipping of a homogeneous hairpin as a function of the
binding energy per base $\varepsilon$ (solid line). The dashed line indicates the Gaussian
approximation $f_0 = (6 k_B T \varepsilon / l b)^{1/2}$, which is obtained by using the
end-to-end distance distribution W(R;m) of a Gaussian chain. Note that the Gaussian
approximation breaks down already at low forces, and the more detailed treatment according to
Eq. (2) is necessary. The stacking energy for AU-pairs at T = 20°C is
$2\varepsilon \approx 1.21$ kcal/mol, corresponding to a threshold force $f_0 \approx 11$ pN,
which agrees with the value observed in Fig. 2b (dashed line).
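As a worked example of the Gaussian approximation in the caption above, the sketch below converts the AU stacking energy to pN nm and evaluates $f_0 = (6 k_B T \varepsilon / l b)^{1/2}$ with the polymer parameters l = 0.7 nm and b = 1.9 nm. The function name is hypothetical; the unit conversions use the standard values of the Boltzmann and Avogadro constants.

```python
import math

# 1 kcal/mol expressed in pN nm (1 pN nm = 1e-21 J).
KCAL_PER_MOL_IN_PN_NM = 4184.0 / 6.02214076e23 * 1e21

def gaussian_threshold_force(eps_kcal_per_mol, T_kelvin=293.15, l=0.7, b=1.9):
    """Gaussian-chain estimate of the unzipping force in pN."""
    kT = 1.380649e-23 * T_kelvin * 1e21        # k_B T in pN nm
    eps = eps_kcal_per_mol * KCAL_PER_MOL_IN_PN_NM
    return math.sqrt(6.0 * kT * eps / (l * b))

# Binding energy per base for AU pairs: half of 1.21 kcal/mol.
print(gaussian_threshold_force(1.21 / 2.0))
```

With these inputs the Gaussian estimate comes out at roughly 8.8 pN, noticeably below the value of about 11 pN obtained from the elastic freely jointed chain, which illustrates why the more detailed treatment is needed.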

When the fluctuations in the random potential are not too weak, the particle follows the other
end of the spring in discrete jumps. The typical size of a jump, $\Delta\ell_{\rm jump}$, is
given by the value of $\ell$ for which the two terms in Eq. (5) are of equal size,
$\Delta\ell_{\rm jump} \approx (2 m \Delta\varepsilon / l^2 \Gamma_0)^{2/3}$. A typical jump
then leads to a drop in the force by $\delta f \approx \Gamma \, l \, \Delta\ell_{\rm jump}$
(with $\Gamma = \Gamma_0/m$), i.e.

$$ \delta f \approx \left( \frac{4 \Gamma_0 \, \Delta\varepsilon^2}{l \, m} \right)^{1/3} . \tag{6} $$

This is valid as long as the thermal broadening of the particle position,
$\Delta\ell_T = (2m / l^2 \Gamma_0 \beta)^{1/2}$ with $\beta = 1/k_B T$, is less than the
typical jump size $\Delta\ell_{\rm jump}$. In the opposite case, the particle is sliding more
or less smoothly, and $\delta f \propto \Delta\varepsilon$.

Eq. (6) furnishes an estimate for the size of the fluctuations in the FEC for the case of a
random hairpin. However, since we used an arbitrary function G(m) as input, the above argument
may be made in general for any structure, as long as Eq. (4) holds sufficiently well.
Alternatively, if for a particular structure the dependence of $\mathrm{var}(\Delta G(\ell))$
on $\ell$ is determined numerically, this could be used to replace Eq. (4), and Eq. (6) would
have to be modified accordingly.
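The estimates above are simple enough to evaluate directly. The following sketch computes the jump size, the force drop of Eq. (6), and the thermal broadening, and reports which regime applies. The function name and the numerical inputs in the usage line are purely illustrative; in particular the values chosen for $\Gamma_0$ and $\Delta\varepsilon$ are hypothetical and not taken from the figures.

```python
def jump_estimates(gamma0, d_eps, m, l=0.7, kT=4.047):
    """Scaling estimates for a particle on a spring in a random-walk potential.

    gamma0 : spring constant times number of bases (pN/nm), Gamma = gamma0 / m
    d_eps  : fluctuation of the binding energy per base (pN nm)
    m      : number of exterior open bases
    l      : segment length (nm); kT in pN nm
    """
    ell_jump = (2.0 * m * d_eps / (l ** 2 * gamma0)) ** (2.0 / 3.0)
    delta_f = (4.0 * gamma0 * d_eps ** 2 / (l * m)) ** (1.0 / 3.0)   # Eq. (6)
    ell_thermal = (2.0 * m * kT / (l ** 2 * gamma0)) ** 0.5
    regime = "jumping" if ell_thermal < ell_jump else "sliding"
    return ell_jump, delta_f, ell_thermal, regime

# Hypothetical inputs, for illustration only.
print(jump_estimates(gamma0=50.0, d_eps=3.0, m=100))
```

The force drop decreases as $m^{-1/3}$, so features become weaker as more single strand is pulled out, and the crossover to the sliding regime occurs when the thermal width exceeds the jump size.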

We now address the question of how the fluctuations in G(m) depend on the secondary structure.
An essential difference between unzipping of a hairpin and force-induced denaturation of a
typical RNA structure is that in the latter case, several stems are being pulled on
simultaneously (footnote 6) for most of the extension interval (see Fig. 2c, which shows the
number of stems as a function of the extension for the group I intron studied above). To
analyze the effect of multiple stems, we constructed artificial sequences that form a given
number n of random hairpins in a row (i.e. the sequences are a concatenation of n random
hairpin sequences, each of which is constructed as explained above). For each n in the range
$1 \le n \le 10$, we computed G(m) and the FEC's for 1000 different sequence realizations, all
with an approximate total length of N = 1000. As an example, Fig. 5 shows the FEC's for three
sequences, which fold into n = 1, 3, and 8 hairpins, respectively. Clearly, the fluctuations
in the force curve decrease with increasing n. We obtained $\mathrm{var}(\Delta G(\ell))$ as
an average over the 1000 realizations and a small interval of m. Some of the resulting curves
are shown in Fig. 6.

6 In principle, a situation where several stems are pulled on in parallel can also arise in
the process of unzipping a single long hairpin, due to accidental palindromic regions in the
single strand which has already been pulled out. However, these non-native interactions have
to overcome the energetic advantage of the native single-hairpin interactions, in order for
the effect to become relevant. Hence the palindrome needs to be extremely GC-rich. For a
single hairpin consisting of random basepairs, we estimated that a non-negligible palindrome
would typically occur only in sequences of at least several thousand bases in length, which is
beyond the length of the sequences studied here.

FIG. 5 Force-extension curves for 1, 3, and 8 hairpins with random basepair composition in a
row (sequence length N = 1000; the middle and upper curves are vertically shifted by 15 and 30
pN respectively; horizontal axis: R [nm]). Clearly, the fluctuations in the force curve
decrease with increasing number of hairpins, except for the last third of the extension
interval, where some of the hairpins of the 8 hairpin curve have already completely
disappeared. In our analysis described in the main text only the first two thirds of all FEC's
were used. The decrease of the force fluctuations with increasing extension is due to the
entropic elasticity of the exterior single strand as described by the m-dependence in Eq. (6).
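The construction of the artificial test sequences described above (a random sequence concatenated with its reverse complement, repeated n times) can be sketched as follows. This is a minimal illustration with hypothetical function names; it uses Watson-Crick complements only, draws bases uniformly, and does not insert any explicit loop bases, so the folding program decides how the hairpin tip closes.

```python
import random

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Reverse complement of an RNA sequence (Watson-Crick pairs only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def random_hairpin(stem_len, rng):
    """Random strand followed by its reverse complement: folds into one hairpin."""
    half = "".join(rng.choice("ACGU") for _ in range(stem_len))
    return half + reverse_complement(half)

def hairpins_in_a_row(n, total_len, rng):
    """Concatenate n random hairpins with an approximate total length total_len."""
    stem_len = total_len // (2 * n)
    return "".join(random_hairpin(stem_len, rng) for _ in range(n))

rng = random.Random(1)
seq = hairpins_in_a_row(8, 1000, rng)
print(len(seq), seq[:40])
```

Because the stem length is rounded down, the total length is only approximately the requested one, in line with the 'approximate total length' quoted in the text.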
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FIG. 6 The variance of AG(£) for different numbers of hair- 
pins. 
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Dependence of Ae 2 on the number of hairpins (cir- 



Although the dependence of $\mathrm{var}(\Delta G(\ell))$ on $\ell$ is not
completely linear, the deviation from linearity over the small
range of $\ell$-values relevant here (typically, $\ell < 12$) is
not very large. For the sake of simplicity, we chose to
interpret the data with the theory for a linear $\mathrm{var}(\Delta G(\ell))$
developed above. To this end, we define an effective $\Delta\epsilon$
for each n from the slope of $\mathrm{var}(\Delta G(\ell))$ at $\ell = 4$.
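
The slope-based definition of the effective $\Delta\epsilon$ can be sketched numerically. The following is only an illustration under stated assumptions: the landscapes are synthetic random walks with Gaussian steps (not folded RNA sequences), the average over "a small interval of m" is taken as a plain mean of per-m variances, and all function names are hypothetical.

```python
import random

def random_walk_landscape(length, mu, sigma, rng):
    """Synthetic G(m) = -sum_{i<=m} eta(i) with Gaussian eta (assumed stand-in)."""
    g = [0.0]
    for _ in range(length):
        g.append(g[-1] - rng.gauss(mu, sigma))
    return g

def increment_variance(landscapes, ell, m_values):
    """var(Delta G(ell)) over realizations, averaged over a small interval of m."""
    total = 0.0
    for m in m_values:
        incs = [g[m + ell] - g[m] for g in landscapes]
        mean = sum(incs) / len(incs)
        total += sum((x - mean) ** 2 for x in incs) / len(incs)
    return total / len(m_values)

def effective_delta_eps_sq(landscapes, m_values, ell=4):
    """Central-difference slope of var(Delta G(ell)) at the chosen ell."""
    upper = increment_variance(landscapes, ell + 1, m_values)
    lower = increment_variance(landscapes, ell - 1, m_values)
    return (upper - lower) / 2.0

rng = random.Random(0)
sigma = 0.8
landscapes = [random_walk_landscape(40, mu=1.5, sigma=sigma, rng=rng)
              for _ in range(2000)]
slope = effective_delta_eps_sq(landscapes, m_values=range(10, 15))
print(f"effective delta_eps^2 = {slope:.3f} (step variance = {sigma**2:.3f})")
```

For an uncorrelated random walk the variance of the increments grows linearly in $\ell$, so the estimated slope should come out close to the step variance; for real sequences the slope would be read off curves like those in Fig. 6 instead.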

Fig. 7 shows that $\Delta\epsilon^2$ decreases monotonically with the
number of stems that are being pulled on simultaneously.
This decrease is almost entirely due to the compensation
effect, which we may intuitively understand as follows.
When a single hairpin is being unzipped, the stick-slip
process described in (Essevaz-Roulet et al., 1997) is
topologically inevitable, since the basepairs have to be opened
in the order they occur. A strongly bound region that
is followed by a weakly bound one then always leads to
a rise and subsequent drop of the FEC. However, with
several hairpins, only the total number of exterior open 
bases is externally constrained, while the individual hair- 
pins may freely open and reclose basepairs (for equilib- 
rium FEC's there is no kinetic constraint). Therefore, if 
in a particular hairpin a strongly bound region is followed 
by a weakly bound one, both regions can open together 
and another hairpin can reclose a few basepairs to com- 
FIG. 8 Scaling plot of the force fluctuations against the
free energy fluctuations. The dashed vertical line marks the 
crossover region between the jumping regime and the sliding 
regime. The solid line is a linear fit to the data with abscissae 
larger than two. It confirms the scaling behavior expected for 
the jumping regime. See text for details. 



pensate for the released single-strand. Obviously, with a 
growing number of hairpins this mechanism will be in- 
creasingly effective. Clearly, in the fixed-force ensemble, 
the compensation effect is equivalent to an average over 
the FEC's of the individual hairpins. Moreover, with a 
large number of hairpins, the fixed-force and the
fixed-distance ensembles become equivalent (D.K. Lubensky
and D.R. Nelson, in preparation). 
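
The compensation argument can be illustrated with a zero-temperature toy model. This is a sketch under assumptions that are not part of the calculation described here: each hairpin is a stem whose basepairs open strictly in order with independent uniform random costs, m counts opened basepairs rather than exterior open bases, loops and thermal averaging are ignored, and the combined landscape is the cheapest way of distributing m opened basepairs over the hairpins (a min-plus convolution). All names are hypothetical.

```python
import random
from statistics import pvariance

def hairpin_landscape(n_pairs, rng):
    """Cumulative cost of opening the first m basepairs of one hairpin, in order."""
    g = [0.0]
    for _ in range(n_pairs):
        g.append(g[-1] + rng.uniform(0.0, 2.0))
    return g

def min_plus(a, b):
    """Min-plus convolution: c[m] = min over i + j = m of a[i] + b[j]."""
    c = [float("inf")] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if ai + bj < c[i + j]:
                c[i + j] = ai + bj
    return c

def combined_landscape(n_hairpins, total_pairs, rng):
    """Minimum cost of opening m basepairs in total, shared among the hairpins."""
    g = [0.0]
    for _ in range(n_hairpins):
        g = min_plus(g, hairpin_landscape(total_pairs // n_hairpins, rng))
    return g

def increment_variance_for(n_hairpins, total_pairs=60, ell=4,
                           realizations=300, seed=1):
    """Variance over realizations of G(m0 + ell) - G(m0) at mid-unfolding."""
    rng = random.Random(seed)
    m0 = total_pairs // 2
    incs = []
    for _ in range(realizations):
        g = combined_landscape(n_hairpins, total_pairs, rng)
        incs.append(g[m0 + ell] - g[m0])
    return pvariance(incs)

for n in (1, 2, 3, 6):
    print(n, round(increment_variance_for(n), 3))
```

In this toy model a single hairpin must pay for its basepairs in sequence order, whereas several hairpins can always pick the cheapest available continuation, so the increments of the combined landscape fluctuate less as n grows. The quantitative values are specific to the assumed cost distribution and should not be compared with Fig. 7.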

To analyze the force fluctuations quantitatively, we
calculated the FEC's for all of the 1000 sequence
realizations of the n parallel hairpins, and defined $\Delta f(R)$
as the standard deviation of the force at extension R
(the so-defined $\Delta f$ is smaller than the typical size of a
force jump, $\delta f$, but should have the same scaling
behavior). Fig. 8 shows a plot of the force fluctuations
against the free energy fluctuations, where the horizontal
axis, $\Delta\epsilon\,(2p^{3}R/l^{3}T)^{1/4} = (\Delta\ell_{\rm jump}/\Delta\ell_{T})^{3/2}$, is scaled
such that it separates the jumping regime from the sliding
regime at a crossover value of one. The vertical axis is
scaled such that the data should collapse onto a straight
line in the jumping regime according to Eq. (^). In order
to guide the eye, Fig. 8 also displays artificial data
(crosses) for which G(m) was generated by drawing random
numbers $\eta(i)$ and taking $G(m) = -\sum_{i=1}^{m} \eta(i)$ (the
different points are for different values for the mean and
variance of $\eta(i)$). The circles mark the data points for
FIG. 9 Force-extension curves for the group I intron under
different conditions. Curve (a) is a copy of the full
thermodynamic curve of Fig. ^a. Curve (b) (vertically shifted by 7 pN)
was calculated by taking only the minimum free energy
structures along the unfolding pathway into account, i.e. the
thermal smoothing due to suboptimal structures is suppressed.
For curve (c) (vertically shifted by 14 pN), the rebinding of
basepairs that had already opened at smaller extension has
also been suppressed, in order to simulate non-equilibrium
pulling.



the parallel hairpins and the rectangular symbol in the
lower left indicates in what region the group I intron is
situated⁷.

For the artificial data (crosses) the above scaling
arguments should apply rigorously. Indeed, the artificial
data fall onto a straight line in the jumping regime (the
solid line represents a linear fit to the points with
abscissae larger than two), and in the sliding regime $\Delta f$ is
proportional to $\Delta\epsilon$ (not shown). For the real data, Fig. 8
shows that passing from a single hairpin through struc- 
tures with several parallel perfect hairpins to a typical 
natural RNA may be viewed as passing from the jump- 
ing regime to the sliding regime for a particle in a (cor- 
related) random potential. At the same time, the FEC's 
change from jagged to smooth. 

As mentioned above, thermal superposition of
alternative secondary structures also contributes to the
smoothing of the FEC's: since the structural elements
in each suboptimal structure open at different values of
m, the thermal average over all these structures smoothes
G(m). In order to assess the importance of this effect,
we suppressed it by taking only the minimum free energy
secondary structures into account instead of calculating
the full partition function Q(m). For the group I intron,
the FEC without the contribution of suboptimal
structures is shown in Fig. 9b. Compared to the full
thermodynamic curve (shown in Fig. 9a), some structure is
gained, but not nearly as much as in the FEC for the
random hairpin of the same length, Fig. ||b. This indicates
that the compensation effect is the dominant source of
the smoothing of the FEC.



IV. DISCUSSION 

In the last section, we found that the equilibrium 
FEC's for typical RNA molecules (like the group I in- 
tron that served us as an example) are quite smooth and 
do not reveal any features that can be associated with the 
opening of structural elements. The compensation effect 
is the primary cause for this result, and we expect it to 
be responsible, in part, also for the experimental observa- 
tion of extremely smooth FEC's for single-stranded DNA 
by Maier et al. (2000). Nevertheless, the measurement



7 The rectangular area marks the range of points that we obtained 
by determining $\delta f$, $\Delta\epsilon$, and To by averaging over different
extension intervals, all within the range 50-110 nm, which is a region
where the mean force is relatively constant (this is required in 
order to separate fluctuations in the force from a gradual change 
in the mean value). 



of equilibrium FEC's for RNA or single-stranded DNA 
might still be useful, e.g. for an experimental determi- 
nation of the RNA/DNA free energy parameters. Usually,
these are extracted from melting curves of oligomers
(Freier et al., 1986), which requires variation of the tem- 
perature away from the temperature of interest up to 
the melting point of the oligomers, where the free en- 
ergy and its temperature derivative are determined. The 
free energy parameters at the temperature of interest are 
then obtained by extrapolation, which introduces an er- 
ror inherent to the method. For pulling experiments, 
the temperature can be kept constant at the value of 
interest, which is an obvious advantage. Here, the lim- 
iting factor is only the precision of the force measure- 
ment. The quantitative relationship between stacking 
energy and threshold force expressed by Fig. ^ furnishes 
the necessary link between force and energy. Measuring 
FEC's for periodic hairpins composed of different building
blocks would lead to curves like the dashed line in
Fig. |^b with different values for the threshold force. From 
these values, the stacking energies could then be deter- 
mined, which might lead to more accurate parameters at 
the desired temperature and salt concentrations. 

There are (at least) two options to obtain FEC's with 
more features, which in turn might allow one to obtain 
information on RNA secondary structure from pulling 
experiments. One could either record non-equilibrium
FEC's or analyze the fluctuations around the equilibrium 
curve. For our theoretical investigation, the latter option 
is not available as long as we work in the fixed-distance 
ensemble, since the force fluctuations around the ther- 
modynamic average diverge in that ensemble. We will 
pursue this option in a separate publication by working 
in a mixed ensemble (Gerland, U., R. Bundschuh, and 
T. Hwa, in preparation). Here, we briefly consider non- 
equilibrium FEC's, where the rate of external increase 
in the force/extension is higher than (some of) the rates 
FIG. 10 Sketch of the assumed pathway for the formation
of a stem-loop structure in the presence of a stretching force
f. A generalized reaction coordinate x is plotted along the
horizontal axis and the free energy G along the vertical axis.
The work that has to be exerted against the force in order
to pull in the single strand needed for the formation of the
stem-loop structure is denoted by $\Delta W$. In principle, the
entropy difference between the random coil state on the left
and the transition state also contributes to the barrier height;
however, we assume that at typical stretching forces it is
negligible compared to $\Delta W$.



associated with internal rearrangements in the secondary
structure. In the case of long proteins, either naturally
occurring as an array of globular domains (Rief et al.,
199^) or synthesized protein arrays (Yang, 2000),
mechanical stretching experiments resolved the unfolding of
up to 20 individual domains. These experiments were
performed under non-equilibrium conditions (Rief et al.,
199$) with typical pulling speeds of 1 μm/s.

In order to estimate whether non-equilibrium condi- 
tions are attainable for RNA with reasonable pulling 
speeds, we need a rough idea of the timescales involved 
in secondary structure rearrangements of RNA. For this, 
we again assume that RNA and single-stranded DNA be- 
have similarly, so that we may draw on an experiment by 
Bonnet, Krichevsky, and Libchaber (Bonnet et al., 1998) 
measuring the opening and closing rates of DNA stem- 
loops using fluorescence correlation spectroscopy. From 
their results, we extract 10 μs as an estimate for the
closing time (at T = 20°C) of a stem-loop structure with three
basepairs and a loop of four nucleotides, which may be
considered as a minimal secondary structure element. We
expect that the formation of the stem-loop takes place in
a single step whose reaction pathway goes through a
transition state where the basepairs of the stem have not yet
formed, but the corresponding bases are already close
together (see Fig. 10). In the presence of an external
force, the closing time must then be multiplied by an
Arrhenius factor $e^{\Delta W/k_B T}$, where $\Delta W$ is the work that
has to be exerted against the force to pull in the amount
of single strand needed for the formation of the stem-loop
(Rief et al., 1998). With a typical force of 6 pN we
obtain $\Delta W \approx 4$ kcal/mol, which results in a closing time on
the order of 10 ms. This timescale has to be compared
to the time it takes to stretch out the stem-loop. At a
pulling speed on the order of 1 μm/s, the two timescales
are comparable and hence, both the formation of new
secondary structure elements and the restoration of already
opened ones are likely to be suppressed⁸. Although it is
beyond the scope of this paper, we want to note that in
the presence of pseudoknots and/or tertiary interactions,
the formation or re-formation of structural elements is
expected to be slowed down even further, due to long
search times for the interaction partners.
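
The arithmetic behind this timescale estimate can be checked with a few lines. This is a minimal sketch, not part of the authors' software: the unit conversions are standard constants, the function names are hypothetical, and the stretch time is estimated under the added assumption that the strand length pulled in equals $\Delta W$ divided by the force.

```python
import math

KCAL_PER_MOL_IN_PN_NM = 6.948  # 1 kcal/mol per molecule, expressed in pN*nm
GAS_CONSTANT_KCAL = 1.987e-3   # gas constant in kcal/(mol K)

def closing_time_under_force(tau0_s, work_kcal_per_mol, temperature_k):
    """Zero-force closing time multiplied by the Arrhenius factor exp(dW / kT)."""
    kt = GAS_CONSTANT_KCAL * temperature_k
    return tau0_s * math.exp(work_kcal_per_mol / kt)

def stretch_time(work_kcal_per_mol, force_pn, speed_nm_per_s):
    """Time to pull out a strand of length dx = dW / f (assumed) at constant speed."""
    dx_nm = work_kcal_per_mol * KCAL_PER_MOL_IN_PN_NM / force_pn
    return dx_nm / speed_nm_per_s

# Values quoted in the text: 10 microseconds, 4 kcal/mol, 20 C, 6 pN, 1 micrometer/s.
t_close = closing_time_under_force(10e-6, 4.0, 293.15)
t_stretch = stretch_time(4.0, 6.0, 1000.0)
print(f"closing time ~ {t_close * 1e3:.1f} ms, stretch time ~ {t_stretch * 1e3:.1f} ms")
```

With these inputs both times come out in the millisecond range, which is the sense in which the two timescales are comparable at a pulling speed of about 1 μm/s.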

To obtain an impression of how many features a
non-equilibrium FEC might show for the group I intron, we
change our equilibrium algorithm, such that the rebinding
of bases is disabled once they have been unbound,
and include only the contribution of the minimum free
energy structures instead of all possible secondary
structures. This is clearly a very crude approximation. In a
proper treatment, only those kinetic processes whose
energy barrier is higher than a certain threshold as
determined by the pulling speed should be suppressed. Also,
we did not account for the fact that the opening of
basepairs occurs at higher forces in non-equilibrium as a
consequence of Kramers theory (Evans and Ritchie, 1997).
Nevertheless, the FEC shown in Fig. 9c gives an idea of
the large number of structural transitions that take place
during force-induced denaturation (for comparison, the
equilibrium FEC is shown again in Fig. 9a). We
therefore believe that non-equilibrium stretching experiments
of RNA could lead to interesting and useful results.

We made most of the software tools developed for the
present work available to the public by creating a 'RNA
pulling server' at http://bioinfo.ucsd.edu/RNA.